

Search results: All records where Creators/Authors contains "Sommers, Joel"


  1. Free, publicly-accessible full text available June 10, 2026
  2. Free, publicly-accessible full text available May 26, 2026
  3. Typosquatting—the practice of registering a domain name similar to another, usually well-known, domain name—is typically intended to drive traffic to a website for malicious or profit-driven purposes. In this paper we assess the current state of typosquatting, both broadly (across a wide variety of techniques) and deeply (using an extensive and novel dataset). Our breadth derives from the application of eight different candidate-generation techniques to a selection of the most popular domain names. Our depth derives from probing the resulting name set via a unique corpus comprising over 3.3B Domain Name System (DNS) records. We find that over 2.3M potential typosquatting names have been registered that resolve to an IP address. We then assess those names using a framework focused on identifying the intent of the domain from the perspectives of DNS and webpage clustering. Using the DNS information, HTTP responses, and Google SafeBrowsing, we classify the candidate typosquatting names as resolved to private IP, malicious, defensive, parked, legitimate, or unknown intents. Our findings provide the largest-scale and most-comprehensive perspective to date on typosquatting, exposing potential risks to users. Further, our methodology provides a blueprint for tracking and classifying typosquatting on an ongoing basis.
    Free, publicly-accessible full text available May 26, 2026
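The candidate-generation step described in the abstract above can be sketched as follows. This is an illustrative subset only: the function implements three common typo transformations (character omission, adjacent transposition, single-character substitution), whereas the paper applies eight techniques whose exact definitions are not reproduced here.

```python
import string

def typo_candidates(domain: str) -> set[str]:
    """Generate typosquatting candidates for a domain using three
    illustrative techniques (the paper in question uses eight)."""
    name, _, tld = domain.partition(".")
    out = set()
    # 1. Character omission: "example" -> "exmple"
    for i in range(len(name)):
        out.add(name[:i] + name[i + 1:] + "." + tld)
    # 2. Adjacent transposition: "example" -> "xeample"
    for i in range(len(name) - 1):
        swapped = name[:i] + name[i + 1] + name[i] + name[i + 2:]
        out.add(swapped + "." + tld)
    # 3. Single-character substitution: "example" -> "ezample"
    for i in range(len(name)):
        for c in string.ascii_lowercase:
            if c != name[i]:
                out.add(name[:i] + c + name[i + 1:] + "." + tld)
    out.discard(domain)  # never emit the original domain itself
    return out
```

Each candidate would then be checked against a DNS corpus to see whether it is registered and resolves, before intent classification.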
  5. Free, publicly-accessible full text available May 7, 2026
  6. The Domain Name System (DNS) is a critical piece of Internet infrastructure with remarkably complex properties and uses, and accordingly has been extensively studied. In this study we contribute to that body of work by organizing and analyzing records maintained within the DNS as a bipartite graph. We find that relating names and addresses in this way uncovers a surprisingly rich structure. In order to characterize that structure, we introduce a new graph decomposition for DNS name-to-IP mappings, which we term elemental decomposition. In particular, we argue that (approximately) decomposing this graph into bicliques (maximal complete bipartite subgraphs) exposes this rich structure. We utilize large-scale censuses of the DNS to investigate the characteristics of the resulting decomposition, and illustrate how the exposed structure sheds new light on a number of questions about how the DNS is used in practice and suggests several new directions for future research.
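The bipartite name-to-IP view described above can be sketched with a first rough step: group (name, IP) records into connected components of the bipartite graph, then test whether a component is a biclique (every name maps to every IP). This is only an illustration of the underlying structure, not the paper's elemental decomposition algorithm, which only approximately decomposes the graph into bicliques.

```python
from collections import defaultdict

def connected_components(edges):
    """Group (name, ip) edges into connected components of the
    bipartite name-to-IP graph.  Nodes are tagged ("n", name) or
    ("a", ip) so a name and an IP with the same string stay distinct."""
    adj = defaultdict(set)
    for name, ip in edges:
        adj[("n", name)].add(("a", ip))
        adj[("a", ip)].add(("n", name))
    seen, comps = set(), []
    for node in adj:
        if node in seen:
            continue
        stack, comp = [node], set()
        while stack:  # iterative DFS over the component
            cur = stack.pop()
            if cur in comp:
                continue
            comp.add(cur)
            stack.extend(adj[cur] - comp)
        seen |= comp
        comps.append(comp)
    return comps

def is_biclique(comp, edges):
    """A component is a biclique iff every name maps to every IP."""
    names = {v for t, v in comp if t == "n"}
    ips = {v for t, v in comp if t == "a"}
    have = {(n, i) for n, i in edges if n in names and i in ips}
    return len(have) == len(names) * len(ips)
```

Components that fail the biclique test are the ones an approximate decomposition would split further.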
  7. Over the past decades, active measurements have been used to gain a deep and broad understanding of routing, latency, packet loss, etc. Unfortunately, typical active measurements are ill-suited for elucidating the performance of individual application flows due to route changes, load balancing, transient queues, and other dynamic effects. Recent efforts have identified in-band measurement, in which probes are injected into an existing application flow, as a promising approach for gaining insight into network behaviors that affect application flows. However, the use of libpcap by these efforts poses significant performance bottlenecks and is at odds with high-fidelity measurements. In this paper, we explore a new implementation pathway for in-band application flow monitoring: the extended Berkeley Packet Filter (eBPF), which enables safe programs to be run within the OS kernel. We develop an eBPF-based in-band flow monitoring tool called ELF that sends hop-limited probes within an existing flow. We compare the performance of our eBPF-based approach with the use of libpcap, finding that libpcap introduces undesirable high variability into the probe emission process. We illustrate the potential of ELF by monitoring hourly Network Diagnostic Tool (NDT) throughput measurements to 12 Measurement Lab destinations for one week. We observe that at least 90% of routers traversed by the in-band probes respond positively, with no apparent rate limiting. We examine how the hop-by-hop evolution of network queues is exposed using ELF in-band probes, illustrate the impact of mid-flow route changes, and show that load balancing may inequitably affect throughput.
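The core in-band idea above (hop-limited probes that share an application flow's 5-tuple so routers and load balancers treat them as part of the same flow) can be sketched as a probe schedule. ELF itself is an eBPF program running in the kernel; the Python below, including its field names, is purely an illustrative data-shape assumption, not ELF's implementation.

```python
def build_probe_schedule(flow, max_hops, probes_per_hop=1):
    """Sketch of in-band, hop-limited probe generation: each probe
    clones the flow's 5-tuple but carries a small TTL, so the router
    at hop `ttl` expires it and returns an ICMP Time Exceeded message
    that identifies that hop."""
    probes = []
    for ttl in range(1, max_hops + 1):
        for _ in range(probes_per_hop):
            probes.append({
                "src": flow["src"], "dst": flow["dst"],
                "sport": flow["sport"], "dport": flow["dport"],
                "proto": flow["proto"],
                "ttl": ttl,  # hop-limited: expires at hop `ttl`
            })
    return probes
```

The paper's key point is where these probes are emitted from: doing this in the kernel via eBPF avoids the emission-timing jitter that a userspace libpcap-based tool incurs.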
  8. A paper by Zhang et al. in 2001, “On the Constancy of Internet Path Properties” [1] examined the constancy of end-to-end packet loss, latency, and throughput using a modest set of hosts deployed in the Internet. In the time since that work, the Internet has changed dramatically, including the flattening of the autonomous system hierarchy and increased deployment of IPv6, among other developments. In this paper, we investigate the constancy of end-to-end Internet latency, revisiting findings of the earlier study. We use latency measurements from RIPE Atlas, choosing a set of 124 anchors with broad geographic distribution and drawn from 112 distinct autonomous systems. The earlier work of Zhang et al. relies on changepoint detection methods to identify mathematically constant time periods. We reimplement the two methods described in that earlier work and use them on the RIPE Atlas latency measurements. We also use a recently-published method (HMM-HDP) that has direct support in a RIPE Atlas API. Comparing the three changepoint detection methods, we find that the two methods used in the earlier work may miss many changepoints caused by common level-shift events. Overall, we find that the recently proposed HMM-HDP method performs substantially better. Moreover, we find that delay spikes—as defined by the earlier work—are an order of magnitude less prevalent than 20 years ago. We also find that maximum change-free regions (CFRs) along paths that we observe in today’s Internet are substantially longer than what was observed in 2001, regardless of the changepoint detection method used. In particular, the 50th percentile maximum CFR was on the order of 30 minutes in the earlier study, but our analysis reveals it to be on the order of 3 days or longer. Moreover, we find that CFR durations appear to have steadily increased over the past 5 years.
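The level-shift events discussed above (step changes in a latency series) can be detected with a minimal before/after window comparison. This sketch is neither the paper's reimplementation of Zhang et al.'s tests nor HMM-HDP; it only illustrates what a changepoint detector looks for, with the window size and threshold as assumed parameters.

```python
import statistics

def level_shift_changepoints(series, window=5, threshold=1.5):
    """Minimal level-shift detector (illustrative only).  Flags index
    i when the means of the `window` samples before and after i differ
    by more than `threshold` times the pooled standard deviation."""
    cps = []
    i = window
    while i <= len(series) - window:
        before = series[i - window:i]
        after = series[i:i + window]
        sd = statistics.pstdev(before + after) or 1e-9  # avoid /0 on flat data
        if abs(statistics.mean(after) - statistics.mean(before)) > threshold * sd:
            cps.append(i)
            i += window  # skip past the detected shift
        else:
            i += 1
    return cps
```

The stretches between successive detected changepoints correspond to the change-free regions (CFRs) whose durations the paper analyzes.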
  9. In this paper, we report on our investigation of how devices connected to the internet accurately report current local time. We describe the basic mechanisms for time management and focus on a critical but unstudied aspect of managing time on connected devices: the time zone database (TZDB). Our longitudinal analysis of the TZDB highlights how internet time has been managed by a loose confederation of contributors over the past 25 years. We drill down on details of the update process, update types and frequency, and anomalies related to TZDB updates. We find that 76% of TZDB updates include changes to the Daylight Saving Time (DST) rules, indicating that DST has a significant influence on internet-based timekeeping. We also find that about 20% of updates were published 15 days or less before taking effect, indicating the potential for instability in the system. We also consider the security aspects of time management and identify potential vulnerabilities. We conclude with a set of proposals for enhancing TZDB management and reducing vulnerabilities in the system.
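The two headline statistics in the abstract above (the share of TZDB releases touching DST rules, and the share published shortly before taking effect) can be computed from release metadata as follows. The tuple layout is an assumption for illustration; real analysis would parse actual tzdb release announcements and rule files.

```python
from datetime import date

def tzdb_update_stats(updates):
    """Summarize TZDB releases: the fraction that change DST rules,
    and the fraction published 15 days or less before their earliest
    effective date.  `updates` holds (release_date,
    earliest_effect_date, changes_dst) tuples (assumed layout)."""
    n = len(updates)
    dst = sum(1 for _, _, changes_dst in updates if changes_dst)
    short_notice = sum(
        1 for rel, eff, _ in updates if (eff - rel).days <= 15
    )
    return {
        "dst_fraction": dst / n,
        "short_notice_fraction": short_notice / n,
    }
```

On the paper's data these fractions come out to roughly 76% and 20%, respectively.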